Data Input/Output
Input
Click Select object, and then select a sample set.
Output
- Data list of the sample set under Output data.
- Statistical analysis of the sample set fields, including attribute, type, number of missing values, maximum, minimum, and more.
Data Preprocessing
Char Conversion
- Click Select object, and then select fields to convert.
- Select a conversion type.
- Set the conversion rules.
Data Balancing
Select an enumeration field, and then select a balancing algorithm and set its related parameters.
Parameter | Description |
---|---|
Balancing method | Select a balancing algorithm. |
Sampling method | Select a method to set the oversampling or undersampling field. |
K-Nearest Neighbors | Set the number of nearest neighbors for the algorithm to calculate. |
Random seed | Enable it to fix the amplification/reduction result so that repeated debugging runs produce the same output. |
SMOTE is used to oversample the minority categories.
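For reference, the following is a minimal sketch of SMOTE oversampling built on the imbalanced-learn package; the platform's internal implementation and defaults may differ.

```python
# Illustrative SMOTE oversampling sketch; not the platform's actual code.
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Build a deliberately imbalanced two-class dataset.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
print(Counter(y))  # majority class heavily outnumbers the minority class

# k_neighbors corresponds to the K-Nearest Neighbors parameter above;
# random_state plays the role of the Random seed switch.
X_res, y_res = SMOTE(k_neighbors=5, random_state=42).fit_resample(X, y)
print(Counter(y_res))  # both classes now have the same count
```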
Data Copy
Data Keep
- Enable Data cache.
- Click Select object, and then select comparison fields.
- Click Config, and then select fields to be kept with the Comparison field.
Only numerical fields are available.
Data Merge
Enable Auto handle duplicate field names, and a suffix 1 will be appended to duplicate field names.
Under Node info, view the input and output data structure.
Data Split
Split the original data set into a Training set and a Validation set, used for regression or classification algorithms.
- Set the ratio between the training set and the validation set. During each debugging run, the actual ratio will differ slightly from the set value.
- Select whether to enable Random seed.
- Enable Random seed, and repeated debugging runs reuse the same ratio as the previous run.
- Disable Random seed, and the ratio differs slightly on each debugging run.
For example, suppose the ratio is set to 80/20.
- With Random seed enabled, the first debugging run may produce 80.21%/19.79%, and the second run will again produce 80.21%/19.79%.
- With Random seed disabled, the first debugging run may produce 80.21%/19.79%, while the second run may produce 79.91%/20.09%.
Displays the data of each set. Only the first 50 entries are displayed.
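The behavior mirrors a conventional random split; as a sketch (not the operator's actual code), scikit-learn's train_test_split reproduces the identical partition when the seed is fixed and shuffles differently when it is not:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(1000).reshape(-1, 1)

# With a fixed seed, repeated runs yield identical 80/20 partitions.
a_train, a_val = train_test_split(X, test_size=0.2, random_state=42)
b_train, b_val = train_test_split(X, test_size=0.2, random_state=42)
assert np.array_equal(a_train, b_train)

# Without a seed, each run shuffles differently.
c_train, c_val = train_test_split(X, test_size=0.2)
```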
Data Filter
- Select fields for filtering.
- Select a filter and set the filter window size.
info When selecting EMA, you can either use adaptive weights or set the weight manually. Manually set weights keep 2 decimal places.
View the effect with the Tendency chart element.
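For instance, an EMA with a manually set weight can be sketched with pandas; here alpha stands in for the manual weight, and an adaptive scheme would derive it from the data instead:

```python
import pandas as pd

signal = pd.Series([10.0, 12.0, 9.0, 14.0, 13.0, 15.0])

# alpha is the manually set weight, kept to 2 decimal places.
ema = signal.ewm(alpha=0.25, adjust=False).mean()
print(ema)
```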
Lag Process
Click Config, and then select the feature field and set the lag cycle.
- Set the lag cycle to a positive integer, and the feature field is shifted forward so that it leads the other fields.
- Set the lag cycle to a negative integer, and the feature field is shifted backward so that it lags behind the other fields.
- Set the lag cycle to 0, and the feature field stays at the same pace as the other fields.
The lag cycle cannot be larger than the number of data rows.
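One plausible reading of this sign convention, sketched with pandas shift (the platform may map the sign differently):

```python
import pandas as pd

df = pd.DataFrame({"feature": [1, 2, 3, 4, 5], "other": [10, 20, 30, 40, 50]})

df["lead_2"] = df["feature"].shift(-2)  # lag cycle +2: feature moves ahead by 2 rows
df["lag_2"] = df["feature"].shift(2)    # lag cycle -2: feature trails by 2 rows
print(df)
```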
Missing Values
Click Select object, select fields that have missing values and set the processing methods.
Normalization
Click Select object, select fields that need to be normalized, and set the processing methods.
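The two preprocessing steps above correspond to standard imputation and scaling; a scikit-learn sketch, illustrative only:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, np.nan], [2.0, 10.0], [3.0, 20.0]])

X_filled = SimpleImputer(strategy="mean").fit_transform(X)  # fill missing values
X_norm = MinMaxScaler().fit_transform(X_filled)             # scale each column to [0, 1]
```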
Num Conversion
Click Select object, and then select fields that need to be converted.
The values stay the same, but the string data type is converted to a numeric type.
Set Role
Click Select object, select fields, and set them as the independent and dependent variables.
Step Limit
Click Select object, select the fields to be limited, and then set the step limit parameters.
Parameter | Description |
---|---|
Comparison field | Select fields to be the compared objects. |
Step deadband | Buffer value between the step limit and the difference between the comparison field value and the original field value. |
Step limit | Limit on the difference between the comparison field value and the original field value. The value must be larger than the step deadband. |
An example to demonstrate the relation between the step deadband and the step limit:
- The original field value is 10, the comparison field value is 3, the step limit is 3, and the step deadband is 1. After processing, the original field becomes 6 (3 + 3), because the difference 7 > 3.
- The original field value is 10, the comparison field value is 3, the step limit is 10, and the step deadband is 8. After processing, the original field becomes 3, because the difference 7 < 8.
- The original field value is 10, the comparison field value is 3, the step limit is 8, and the step deadband is 5. After processing, the original field stays 10, because 5 < 7 < 8.
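The rule the three examples imply can be written as a small function; this is an interpretation of the documented behavior, not the platform's code.

```python
def step_limit(original: float, comparison: float, limit: float, deadband: float) -> float:
    """Interpretation of the step-limit rule implied by the examples above."""
    diff = abs(original - comparison)
    sign = 1 if original >= comparison else -1
    if diff > limit:
        return comparison + sign * limit  # clamp to comparison value + step limit
    if diff < deadband:
        return comparison                 # within the deadband: snap to the comparison value
    return original                       # between deadband and limit: unchanged

assert step_limit(10, 3, limit=3, deadband=1) == 6    # difference 7 > 3
assert step_limit(10, 3, limit=10, deadband=8) == 3   # difference 7 < 8
assert step_limit(10, 3, limit=8, deadband=5) == 10   # 5 < 7 < 8
```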
Signal Processing
Digital Filter
Click Select object, select fields and set the filter, frequency and type.
Parameter | Description |
---|---|
Batch Filter | Select filters from FIR filter and IIR filter. |
Order | The higher the order, the more complicated and accurate the frequency response of the filter is. |
Sampling frequency | The number of signal sampling every second. |
Filter type | Select the type of the filter. |
Window function | Adjusts and restricts the frequency response of the filter. info Only available for FIR filter. |
Coefficient normalization | Scales the filter coefficients to a specified range to ensure filter stability. info Only available for FIR filter. |
Cutoff frequency | The frequency where the filter stops passing signals. |
Standard deviation | Controls the window function shape and spectral characteristics. The larger the number, the wider the Gaussian window. |
Maximum ripple | The maximum amplitude of Chebyshev I filter within the band-pass frequency range. info Unavailable for Butterworth filter. |
Minimum attenuation | The minimum attenuation of Chebyshev I filter within the band-stop frequency range. info Unavailable for Butterworth filter. |
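As an illustration of how these parameters typically map onto filter design, here is a SciPy sketch of an FIR design with a window function and an IIR Butterworth design; the platform's implementation may differ.

```python
import numpy as np
from scipy import signal

fs = 1000.0  # sampling frequency (Hz)

# FIR low-pass filter: order 50 (51 taps), Hamming window, 100 Hz cutoff.
fir = signal.firwin(numtaps=51, cutoff=100.0, window="hamming", fs=fs)

# IIR low-pass filter: 4th-order Butterworth at the same cutoff.
b, a = signal.butter(N=4, Wn=100.0, btype="low", fs=fs)

# Apply either filter to a noisy test signal.
t = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.random.randn(t.size)
y_fir = signal.lfilter(fir, 1.0, x)
y_iir = signal.filtfilt(b, a, x)
```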
Wavelet Transform
Click Select object, select fields and set the filter parameters.
Parameter | Description |
---|---|
Wavelet filter | Select a wavelet filter to decompose and recompose signals. |
Decomposition depth | The level of decomposing signals. |
Boundary processing | The way of processing boundary signals. |
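A PyWavelets sketch of the decomposition/reconstruction cycle, where db4, level, and mode stand in for the wavelet filter, decomposition depth, and boundary processing settings:

```python
import numpy as np
import pywt

x = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.2 * np.random.randn(256)

# Decompose into 3 levels of coefficients, then reconstruct.
coeffs = pywt.wavedec(x, wavelet="db4", level=3, mode="symmetric")
x_rec = pywt.waverec(coeffs, wavelet="db4", mode="symmetric")
```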
Feature Engineering
Correlation Matrix
Click Select object, and then select fields and an algorithm type.
Feature Expression
- Click Select object, set feature name and data type.
- Select a field and click Reference.
- Set the expression using the on-screen keyboard.
- Click Verify to check whether the expression is correct.
- Click Save and then click OK.
Feature Importance
Provided models are based on decision trees, including random forest and gradient boosting decision trees.
Make sure you add a Set Role component before Feature Importance.
Parameter | Description |
---|---|
Learner type | Select the type of the learner. |
Feature evaluation criteria | Select the criteria used to evaluate feature importance. |
Maximum number of iteration | The maximum number of iterations or the maximum number of decision trees in the random forest ensemble. |
Maximum feature | The maximum number of features considered during the construction of each decision tree in the ensemble. |
Maximum depth | The limit set on the maximum depth or number of levels of each decision tree in the ensemble. |
Minimum sample size of leaf nodes | Specifies the minimum number of samples required to consider further splitting a node and form a leaf node. |
Minimum sample size for splitting | Specifies the minimum number of samples needed to consider splitting a node further. |
Random seed | Enable it so that repeated debugging runs produce the same result. |
Parameter | Description |
---|---|
Step length | Controls the contribution of each decision tree to the overall model. |
Subsampling | The technique of randomly selecting a subset of training samples during each iteration. |
Loss function | The loss function to be minimized during training. |
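The parameters above map naturally onto scikit-learn's tree ensembles; a hedged sketch of how feature importances are typically obtained (the operator's learners may be configured differently):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=8, random_state=0)

rf = RandomForestRegressor(
    n_estimators=100,      # maximum number of iterations / trees
    max_features="sqrt",   # maximum feature
    max_depth=6,           # maximum depth
    min_samples_leaf=5,    # minimum sample size of leaf nodes
    min_samples_split=10,  # minimum sample size for splitting
    random_state=42,       # random seed
).fit(X, y)
print(rf.feature_importances_)

gb = GradientBoostingRegressor(
    learning_rate=0.1,     # step length
    subsample=0.8,         # subsampling
    loss="squared_error",  # loss function
    random_state=42,
).fit(X, y)
print(gb.feature_importances_)
```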
Outlier Detection
Click Select object, select fields, and set the corresponding detection mode.
Principal Component Analysis
Click Select object, select fields and set PCA components.
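A minimal scikit-learn PCA sketch, where n_components stands in for the PCA components setting:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)      # project onto 2 principal components
print(pca.explained_variance_ratio_)  # variance captured by each component
```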
Scatter Plot
Click Select object, select the chart type and corresponding fields, and set the display color scheme of the coordinate system.
Select Group to group fields; the group field must be a string field.
Tendency Chart
Click Select object, select fields and the display color scheme on the tendency chart.
Machine Learning
Make sure you add a Set Role component before machine learning operators.
Regression-BPNN Regression
Parameter | Description |
---|---|
Data normalization | Ensures that the input features are within a similar range and have comparable scales. |
Hidden neuron | Determines the complexity and representational capacity of the network. |
Weight decay | Prevents overfitting and improves generalization performance by adding a regularization term to the loss function during training, which penalizes large weight values. |
Maximum number of iterations | The maximum number of training iterations or epochs that the algorithm will go through during the training process. |
Prediction interval control | Manage the uncertainty or confidence associated with the predictions made by the model. |
Regression-Curve Regression
Select one independent variable and one dependent variable in Set Role, and then set the order.
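Curve regression of this kind amounts to polynomial fitting; a NumPy sketch where deg plays the role of the order setting:

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = 1.5 * x**2 - 2.0 * x + 0.5 + 0.1 * np.random.randn(x.size)

coeffs = np.polyfit(x, y, deg=2)  # fit a 2nd-order curve
y_hat = np.polyval(coeffs, x)     # evaluate the fitted curve
```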
Regression-Decision Tree Regression
Parameter | Description |
---|---|
Maximum depth | The maximum number of levels or node depths allowed in the decision tree model. |
Information metrics | Only supports variance, meaning the squared difference between real value and predicted value. The smaller the calculated value, the better the model is fitted. |
Pruning coefficient | Reduces the complexity of the decision tree. The larger the value, the simpler the tree gets. caution The value should be set based on data set features and actual requirements. Underfitting or overfitting might happen due to unreasonable coefficient settings. |
Minimum sample size of leaf nodes | Specifies the minimum number of samples required to consider further splitting a node and form a leaf node. |
Regression-DNN Regression
Parameter | Description |
---|---|
Data normalization | Scales the feature values of input data to a suitable range, making the training process of the neural network more stable and efficient. |
Prediction interval control | Estimates the confidence intervals or variance ranges of the predicted values, providing more comprehensive information about the prediction results. |
Number of hidden layers | Layers between the input layer and the output layer, where the actual learning and feature extraction occur. |
Output layer activation function | The linear activation function simply outputs the weighted sum of the input without applying any non-linear transformation. |
Number of batch training | The number of samples used in each batch during the training of a DNN regression model. |
Number of iterations | The number of times the entire training dataset is processed by the neural network during the training process. |
Validation data ratio | Represents the fraction of the total dataset that is set aside for validation during the model training process. |
Optimizer | Determines how the model's parameters (weights and biases) are updated during the training process to minimize the loss function and improve the model's performance. |
Loss function | Measures the discrepancy between the predicted output and the actual target values. |
Regularization parameter | Helps prevent overfitting by adding a penalty term to the loss function, discouraging the model from becoming too complex and relying too much on the training data. |
Kernel weight | The weights represent the strength of the connections between neurons in different layers. |
Bias weight | Used to introduce an offset or bias to the output of the neuron. |
Regression-Linear Regression
Parameter | Description |
---|---|
Regularization parameter | Prevents overfitting and improves the generalization ability of the model. It adds a regularization term to the loss function during training, which penalizes large coefficient values. The regularization parameter controls the strength of this penalty. |
Penalty function type | Select the type of penalty (regularization) function. |
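A scikit-learn sketch of how the regularization parameter and penalty type interact, with Ridge and Lasso standing in for the L2 and L1 penalties (illustrative only):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty; alpha is the regularization parameter
lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty; drives some coefficients to exactly 0
print(ridge.coef_)
print(lasso.coef_)
```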
Regression-LSTM Regression
Parameter | Description |
---|---|
Data normalization | Scales the feature values of input data to a suitable range, making the training process of the neural network more stable and efficient. |
Prediction interval control | Estimates the confidence intervals or variance ranges of the predicted values, providing more comprehensive information about the prediction results. |
Time step | The number of past observations or data points that are considered as input to the LSTM network when making a prediction for the next time step. |
Number of LSTM layers | The depth of the LSTM architecture used in the model. When the dataset is small and noisy, it is better to set a smaller number. |
Output layer activation function | The linear activation function simply outputs the weighted sum of the input without applying any non-linear transformation. |
Number of batch training | The number of samples used in each batch during the training of an LSTM regression model. |
Number of iterations | The number of times the entire training dataset is processed by the neural network during the training process. |
Validation data ratio | Represents the fraction of the total dataset that is set aside for validation during the model training process. |
Optimizer | Determines how the model's parameters (weights and biases) are updated during the training process to minimize the loss function and improve the model's performance. |
Loss function | Measures the discrepancy between the predicted output and the actual target values. |
Regularization parameter | Helps prevent overfitting by adding a penalty term to the loss function, discouraging the model from becoming too complex and relying too much on the training data. |
Kernel weight | The weights represent the strength of the connections between neurons in different layers. |
Recurrent weight | The set of weights that control the recurrent connections within the LSTM cells. |
Bias weight | Used to introduce an offset or bias to the output of the neuron. |
Regression-Nonlinear SVM Regression
Parameter | Description |
---|---|
Kernel function type | Maps the input data into a higher-dimensional space, where it becomes more suitable for linear regression. |
Kernel function coefficient | Controls the shape and flexibility of the kernel function. |
Penalty factor C | Controls the trade-off between maximizing the margin and minimizing the training error. |
Error accuracy | The error tolerance used when fitting the regression model; smaller values demand a closer fit. |
Maximum number of iterations | Specifies the maximum number of times the optimization algorithm will update the model's parameters while trying to minimize the cost function or loss. |
Regression-Random Forest Regression
Parameter | Description |
---|---|
Number of trees | The number of decision trees that are built during the training process. |
Maximum depth | The maximum number of levels or node depths allowed in the decision tree model. |
Info metrics | Select the information metrics used to evaluate split quality. |
Maximum feature | The maximum number of features considered during the construction of each decision tree in the ensemble. |
Minimum sample size of leaf nodes | Specifies the minimum number of samples required to consider further splitting a node and form a leaf node. |
Cost-complexity pruning | Ensures that each decision tree in the Random Forest is already relatively simple and less prone to overfitting. |
Regression-RNN Regression
Parameter | Description |
---|---|
Data normalization | Scales the feature values of input data to a suitable range, making the training process of the neural network more stable and efficient. |
Prediction interval control | Estimates the confidence intervals or variance ranges of the predicted values, providing more comprehensive information about the prediction results. |
Time step | The number of past observations or data points that are considered as input to the RNN when making a prediction for the next time step. |
Number of RNN layers | The depth of the recurrent architecture used in the model. When the dataset is small and noisy, it is better to set a smaller number. |
Output layer activation function | The linear activation function simply outputs the weighted sum of the input without applying any non-linear transformation. |
Number of batch training | The number of samples used in each batch during the training of an RNN regression model. |
Number of iterations | The number of times the entire training dataset is processed by the neural network during the training process. |
Validation data ratio | Represents the fraction of the total dataset that is set aside for validation during the model training process. |
Optimizer | Determines how the model's parameters (weights and biases) are updated during the training process to minimize the loss function and improve the model's performance. |
Loss function | Measures the discrepancy between the predicted output and the actual target values. |
Regularization parameter | Helps prevent overfitting by adding a penalty term to the loss function, discouraging the model from becoming too complex and relying too much on the training data. |
Kernel weight | The weights represent the strength of the connections between neurons in different layers. |
Recurrent weight | The set of weights that control the recurrent connections within the RNN cells. |
Bias weight | Used to introduce an offset or bias to the output of the neuron. |
Regression-XGB Regression
Parameter | Description |
---|---|
Weak estimator | A variant of decision trees that focuses on minimizing the gradients of the loss function during training. Each weak estimator is trained to correct the errors made by the previous estimators, gradually improving the model's predictions. |
Objective function | The loss function to be minimized during the training process. It quantifies the discrepancy between the predicted values and the true target values, guiding the optimization of the XGBoost regression model. |
Maximum depth | The maximum number of levels or node depths allowed in the decision tree model. |
Learning rate | Controls the contribution of each weak learner to the overall ensemble model. |
Minimum loss of leaf nodes | Controls the minimum amount of improvement required to split a leaf node further. It determines whether a further split at a leaf node will be considered based on the reduction in the loss function achieved by the split. |
Leaf node weight | The values assigned to the leaf nodes of each decision tree in the ensemble. |
Maximum iteration step | Limits the maximum step size of each boosting update (comparable to XGBoost's max_delta_step). |
L1/2 regular parameters | Control the strength of the L1 and L2 regularization applied in the model, which prevents overfitting and improves the generalization ability of the XGBoost regression model by adding penalty terms to the loss function. |
Number of iterations | Determines the maximum number of decision trees to be built during the training process. |
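One plausible mapping of these parameters onto the xgboost package's XGBRegressor arguments; the correspondence is an assumption, and the platform's naming may differ:

```python
from xgboost import XGBRegressor

model = XGBRegressor(
    objective="reg:squarederror",  # objective function
    max_depth=6,                   # maximum depth
    learning_rate=0.1,             # learning rate
    gamma=0.0,                     # minimum loss of leaf nodes
    min_child_weight=1.0,          # leaf node weight
    max_delta_step=0,              # maximum iteration step
    reg_alpha=0.0,                 # L1 regular parameter
    reg_lambda=1.0,                # L2 regular parameter
    n_estimators=100,              # number of iterations / trees
)
# model.fit(X_train, y_train); model.predict(X_val)
```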
Classification-BPNN Classification
Parameter | Description |
---|---|
Hidden neuron | Represents the number of neurons in the hidden layer of the neural network, which is a crucial component that allows the network to learn complex representations and patterns from the input data. |
Weight decay | Prevents overfitting and improves the generalization ability of the model. |
Maximum number of iterations | Sets an upper limit on the number of times the training algorithm will update the model's weights. |
Classification-CART Decision Tree Classification
Parameter | Description |
---|---|
Maximum depth | The maximum allowed depth of a tree in the algorithm. |
Information metrics | Evaluates the quality of a split at each node. |
Pruning coefficient | Used in the process of cost-complexity pruning, which is a technique to prevent overfitting and improve the generalization ability of the decision tree. |
Minimum sample size of leaf nodes | Specifies the minimum number of samples required to create a leaf node during the tree-building process. |
Classification-ID3 Decision Tree
Parameter | Description |
---|---|
Maximum depth | The maximum allowed depth of a tree in the algorithm. |
Information metrics | Evaluates the quality of a split at each node. |
Pruning coefficient | Used in the process of cost-complexity pruning, which is a technique to prevent overfitting and improve the generalization ability of the decision tree. |
Minimum sample size of leaf nodes | Specifies the minimum number of samples required to create a leaf node during the tree-building process. |
Classification-KNN Classification
Parameter | Description |
---|---|
Data normalization | A preprocessing step to ensure that all features have a similar scale and range. |
Number of nearest neighbors | The value of K, which represents the number of data points or neighbors to consider when making predictions for a new data point. |
Measure distance | Calculates the distance between data points in the feature space. |
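A scikit-learn sketch combining the three parameters above; the scaler performs the data normalization step, n_neighbors is K, and metric is the distance measure:

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

knn = make_pipeline(
    MinMaxScaler(),  # data normalization
    KNeighborsClassifier(n_neighbors=5, metric="euclidean"),
)
# knn.fit(X_train, y_train); knn.predict(X_val)
```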
Classification-Logistic Regression
Parameter | Description |
---|---|
Regularization parameter | Used to control the amount of regularization applied to the logistic regression model. |
Penalty function type | Select the type of penalty (regularization) function. |
Classification-Random Forest Classifier
Parameter | Description |
---|---|
Number of trees | The number of decision trees that are built during the training process. |
Maximum depth | The maximum number of levels or node depths allowed in the decision tree model. |
Info metrics | Select the information metrics used to evaluate split quality. |
Maximum feature | The maximum number of features considered during the construction of each decision tree in the ensemble. |
Minimum sample size of leaf nodes | Specifies the minimum number of samples required to consider further splitting a node and form a leaf node. |
Cost-complexity pruning | Ensures that each decision tree in the Random Forest is already relatively simple and less prone to overfitting. |
Classification-XGBoost Classification
Parameter | Description |
---|---|
Weak estimator | A variant of decision trees that focuses on minimizing the gradients of the loss function during training. Each weak estimator is trained to correct the errors made by the previous estimators, gradually improving the model's predictions. |
Objective function | The loss function to be minimized during the training process. It quantifies the discrepancy between the predicted values and the true target values, guiding the optimization of the XGBoost classification model. |
Maximum depth | The maximum number of levels or node depths allowed in the decision tree model. |
Learning rate | Controls the contribution of each weak learner to the overall ensemble model. |
Minimum loss of leaf nodes | Controls the minimum amount of improvement required to split a leaf node further. It determines whether a further split at a leaf node will be considered based on the reduction in the loss function achieved by the split. |
Leaf node weight | The values assigned to the leaf nodes of each decision tree in the ensemble. |
Maximum iteration step | Limits the maximum step size of each boosting update (comparable to XGBoost's max_delta_step). |
L1/2 regular parameters | Control the strength of the L1 and L2 regularization applied in the model, which prevents overfitting and improves the generalization ability of the XGBoost classification model by adding penalty terms to the loss function. |
Number of iterations | Determines the maximum number of decision trees to be built during the training process. |
Cluster-EM
Parameter | Description |
---|---|
Number of clusters | Represents the number of clusters that the algorithm will attempt to identify in the data. |
Maximum number of iterations | The maximum allowed number of iterations that the algorithm can run before it stops. |
Cluster-KMeans
Parameter | Description |
---|---|
Data standard | The process of scaling the features or variables in the dataset to have a mean of 0 and a standard deviation of 1. |
Number of clusters | Determines the number of clusters the algorithm will attempt to identify in the data. |
Maximum number of iterations | The maximum allowed number of iterations that the algorithm can run before it stops. |
Distance metric type | The Euclidean distance is a measure of the straight-line distance between two data points in a multidimensional space. |
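A scikit-learn KMeans sketch; StandardScaler performs the data standardization step, and the remaining arguments mirror the table (Euclidean distance is the default metric):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
X_std = StandardScaler().fit_transform(X)  # mean 0, standard deviation 1

km = KMeans(n_clusters=3, max_iter=300, n_init=10)
labels = km.fit_predict(X_std)
```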
Optimization-Genetic Algorithm
Parameter | Description |
---|---|
Value range config | Set the value range of the independent variable. |
Population size | Represents the number of potential solutions that are evolved and evaluated in each generation. |
Maximum number of iterations | The maximum allowed number of generations that the algorithm can run before it terminates. |
Number of repeated tests | How many times the algorithm is executed with different random initializations or settings. |
Optimization direction | Whether the algorithm is being used for maximization or minimization of a specific objective function. |
Crossover probability | The probability of applying crossover to create new offspring during the reproduction step of the algorithm. |
Crossover factor | The proportion of the population that will undergo crossover to produce offspring in the next generation. |
Mutation probability | The probability of applying mutation to an individual's genetic information during the reproduction step of the algorithm. |
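To show how these parameters typically interact, here is a compact genetic-algorithm sketch for maximizing a one-variable objective; it is illustrative only and not the platform's implementation.

```python
import random

LOW, HIGH = -5.0, 5.0  # value range config
POP_SIZE = 50          # population size
MAX_GEN = 100          # maximum number of iterations
P_CROSS = 0.8          # crossover probability
P_MUT = 0.1            # mutation probability

def fitness(x: float) -> float:
    # Optimization direction: maximize; the optimum is at x = 2.
    return -(x - 2.0) ** 2

pop = [random.uniform(LOW, HIGH) for _ in range(POP_SIZE)]
for _ in range(MAX_GEN):
    # Tournament selection: keep the fitter of two random individuals.
    pop = [max(random.sample(pop, 2), key=fitness) for _ in range(POP_SIZE)]
    # Arithmetic crossover on adjacent pairs.
    for i in range(0, POP_SIZE - 1, 2):
        if random.random() < P_CROSS:
            a, b, w = pop[i], pop[i + 1], random.random()
            pop[i], pop[i + 1] = w * a + (1 - w) * b, w * b + (1 - w) * a
    # Gaussian mutation, clipped back into the value range.
    pop = [
        min(HIGH, max(LOW, x + random.gauss(0, 0.1))) if random.random() < P_MUT else x
        for x in pop
    ]

best = max(pop, key=fitness)
print(f"best x ~ {best:.3f}")  # should approach 2.0
```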
Custom
Python Script
Dependencies can be imported through Env Resources > Packages.
Parameter | Description |
---|---|
Development note | Displays input/output data and model information which will be used in your script. |
Algorithm type setting | Select the output algorithm type from Regression, Classification, Cluster and Other. |
Data structure | Click Set output, and then you can either take reference from the input fields (Reference input) or create new fields (New) to be the output fields. |
Python editor | Write your script of both Training and Runtime, and then click Debug to make sure it runs correctly before applying to the model. |
Model Assessment
Type | Description |
---|---|
Classification Eval | Evaluate classification model precision based on the applied model calculation. |
Cluster Eval | Evaluate cluster model precision based on the applied model calculation. |
Model Apply | Use validation data to test the applied model. |
Regression Eval | Evaluate regression model precision based on the applied model calculation and set grading standard. |